Abstract: Online intrusion alert aggregation with generative data stream modeling is a probabilistic 
approach to alert aggregation based on generative modeling. It assumes that each attack instance 
can be regarded as a random process that produces alerts. 
This paper aims at collecting and modeling the alerts of such attack instances on the basis of similar 
parameters, so that an attack can be identified from its beginning to its completion. The collected and 
modeled alerts are given to security personnel, who can draw conclusions and take appropriate action. 
Using several data sets, we show that the number of alerts can be reduced considerably while the number 
of missing meta alerts remains extremely low. 

We also demonstrate that meta alerts are generated with a delay of only a few seconds after the 
first alert has been produced. 

Keywords: online intrusion detection system, data stream, alert aggregation, IDS, offline alert 
aggregation, online alert aggregation. 

I. Introduction 

In general, an IT system holds a huge amount of information, and this information is often confidential. 
Securing it is an essential task in any information technology system, and doing so requires the 
development of new, innovative technologies. 

An Intrusion Detection System (IDS) plays a very important role in information security. It can be a device 
or a software application that detects intrusions from outside and also monitors inside activities such as 
unauthorized access. It detects suspicious actions by evaluating TCP/IP connections or log files. When an IDS 
finds a suspicious action, it produces an alert immediately. The alert contains information such as the source 
IP address, the destination IP address, and the possible type of attack, for example buffer overflow, denial of 
service, or SQL injection. Alert processing is done at a very low level of the IDS, so a single attack instance 
can give rise to thousands of alerts. This is a drawback of existing IDS. There are two types of IDS. 

1.1 NIDS: A NIDS is a Network Based IDS. It is an independent platform that analyzes network traffic and 
monitors many hosts. A network based IDS gains access to the network through a network tap, network switch, 
network router etc. Sensors placed in the network capture the traffic and analyze its content. Snort is an 
example of a network based IDS. 

1.2 HIDS: A HIDS is a Host Based Intrusion Detection System. It may be a dependent or an independent 
platform. In a host based IDS the sensors consist of agents placed on the hosts. These agents analyze log files, 
system calls, and other host activities. OSSEC is an example of a host based IDS. 

II. Related Work 

Existing IDS detect attacks with very high accuracy, but they still have some drawbacks: alerts are 
produced at a very low level of the IDS, thousands of alerts may be produced for a single attack instance, and 
the large number of alerts can cause confusion about which action to take while an attack is under way. Many 
researchers have worked on removing these drawbacks and have provided directions for future enhancements 
of IDS. 

A suitable way to correlate different alerts is presented in [6], where alert threads are reconstructed. 
The alerts produced by the IDS are aggregated using a fixed-length window. This can produce duplicates, 
which should be eliminated for the IDS to work properly. Such duplicates are eliminated in [7] by clustering 
the alerts offline as well as online: an offline algorithm was developed first to eliminate the duplicates, and 
it was then extended to an online algorithm. The current attack situation is addressed in [8]. Clustering is 
used to group similar attacks in [9]. Instead of alert clustering, another way to correlate alerts is presented 
in [10], where two alerts are combined on the basis of a weighted, attribute-wise similarity operator. According 
to [11] and [12], this approach has the disadvantage that a large number of parameters must be set. [13] has 
the same disadvantage as [10]. To overcome it, [14] uses another clustering algorithm with user-defined 
parameters, based on strict sorting by the source and target (i.e. destination) IP addresses and ports in the 
alerts. [22] takes an entirely different approach to clustering: the reconstruction error of an auto-associator 
neural network (AA-NN) is used to analyze alerts, and alerts that produce the same reconstruction error are 
placed into the same cluster. The major advantage of this approach is that it can be applied offline as well as 
online: the AA-NN is first trained offline, and the training can then be extended to online training. 

III. Online Intrusion Alert Aggregation Technique 

In this section, we discuss our new alert aggregation approach. As stated already, it is a 
probabilistic model of the current situation of different types of attacks. We start with the architecture of our 
system, giving a diagram and a detailed description of its layers. We then describe how alerts are generated 
and the alert format, i.e. the contents of an alert. After that we discuss the clustering algorithm for offline 
alert aggregation and how it is extended to the online case. Finally, we prepare the results and analyze them 
with respect to the generation of meta alerts. The meta alerts that are produced are sent to the user's 
registered mobile. 
3.1 Architecture: 

The following figure shows the architecture of the proposed system. 

Fig 1. Layered Model of Proposed System 

3.1.1 Sensor Layer: This is the low-level layer, which acts as an interface to the network and to the host (on 
which an agent resides). It captures raw data from both the network and the host, filters it, and extracts the 
essential data to create an event. The sensor layer consists of sensors which capture the traffic on the network. 

3.1.2 Detection Layer: This layer consists of different types of detectors, e.g. Support Vector Machines or Snort. 
It performs misuse detection and anomaly detection. If it finds suspicious behavior, it creates alerts and forwards 
them to the next layer of the proposed architecture, i.e. the alert processing layer. 

3.1.3 Alert Processing Layer: The alerts received from the detection layer are processed at this layer in such 
a way that meta alerts are generated. Meta alerts are generated on the basis of attack instance information, 
which includes the source and destination IP addresses and the possible type of attack. 

3.1.4 Reaction Layer: This layer resembles an Intrusion Prevention System, which prevents intrusions. It takes 
appropriate action for the meta alerts produced by the alert processing layer. 
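
The division of work among the four layers can be sketched as a small pipeline. The sketch is illustrative only: the function names, the dictionary fields, and the single toy signature are assumptions, and grouping by exact (source, destination, attack type) merely stands in for the probabilistic aggregation performed by the alert processing layer. 

```python
from collections import defaultdict


def sensor_layer(raw_packets):
    """Filter raw data down to the fields needed to build an event (assumed fields)."""
    return [{"src": p["src"], "dst": p["dst"], "payload": p["payload"]}
            for p in raw_packets]


def detection_layer(events):
    """Toy misuse detector: raise an alert when a known signature occurs."""
    signatures = {"' OR 1=1": "sql_injection"}  # hypothetical signature table
    alerts = []
    for event in events:
        for pattern, attack_type in signatures.items():
            if pattern in event["payload"]:
                alerts.append({"src": event["src"], "dst": event["dst"],
                               "type": attack_type})
    return alerts


def alert_processing_layer(alerts):
    """Group alerts by (source, destination, attack type) into meta alerts."""
    counts = defaultdict(int)
    for alert in alerts:
        counts[(alert["src"], alert["dst"], alert["type"])] += 1
    return [{"src": s, "dst": d, "type": t, "alert_count": n}
            for (s, d, t), n in counts.items()]


def reaction_layer(meta_alerts):
    """Turn each meta alert into a notification message."""
    return [f"notify: {m['type']} from {m['src']} to {m['dst']} "
            f"({m['alert_count']} alerts)" for m in meta_alerts]


packets = [
    {"src": "10.0.0.5", "dst": "192.168.1.20", "payload": "id=1' OR 1=1", "ttl": 64},
    {"src": "10.0.0.5", "dst": "192.168.1.20", "payload": "name=x' OR 1=1", "ttl": 64},
    {"src": "10.0.0.8", "dst": "192.168.1.20", "payload": "id=7", "ttl": 64},
]
messages = reaction_layer(
    alert_processing_layer(detection_layer(sensor_layer(packets))))
print(messages)
```

Here two low-level alerts from the same source collapse into a single notification, which is the effect the alert processing layer is meant to achieve. 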
3.2 Alert Generation and Format: 

We have discussed the functions of each layer of the proposed system. We now describe the 
generation of alerts in detail. Sensors in the sensor layer capture the traffic over the network. The sensor layer 
also determines the attributes that serve as input for the detectors in the detection layer. These attributes can 
be used to differentiate attack instances, and they may be dependent on or independent of each other. The 
attributes of the alerts generated by the detectors include the source IP address, the target IP address, and 
the possible type of attack, such as denial of service, buffer overflow, or SQL injection. The format of an alert 
is as follows: 

An alert $a$ has $N$ attributes. Of these, the first $N_m$ are categorical and the remaining $N - N_m$ are 
continuous: 

$a = (a_1, \dots, a_{N_m}, a_{N_m+1}, \dots, a_N)$ 

where $a_1, \dots, a_{N_m}$ are the categorical attributes and 

$a_{N_m+1}, \dots, a_N$ are the continuous attributes. 
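
As an illustration of this format, an alert can be held in a small record that keeps the categorical and the continuous attributes apart. This is a minimal sketch: the class name `Alert`, its field names, and the use of a creation time as the only continuous attribute are assumptions made for the example, not part of the proposed system. 

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Alert:
    """An alert with N_m categorical attributes followed by continuous ones."""
    categorical: Tuple[str, ...]   # e.g. source IP, target IP, attack type
    continuous: Tuple[float, ...]  # e.g. creation time in seconds (assumed)

    @property
    def n_m(self) -> int:
        """Number of categorical attributes, N_m."""
        return len(self.categorical)

    @property
    def n(self) -> int:
        """Total number of attributes, N."""
        return len(self.categorical) + len(self.continuous)


alert = Alert(categorical=("10.0.0.5", "192.168.1.20", "sql_injection"),
              continuous=(12.5,))
print(alert.n_m, alert.n)
```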

3.3 Offline Alert Aggregation: 

In this section we develop the offline alert aggregation, which will later be extended to data streams 
for online alert aggregation. Different attacks are carried out over TCP/UDP traffic. Some of the resulting 
alerts are false positives, and some alerts are missing (false negatives). All alerts are analyzed and finally 
aggregated offline. The resulting clustering can suffer from the following errors. 

1. Some false alerts are not recognized as such and are assigned to a cluster. 

2. True alerts may be assigned to the wrong cluster. 

3. A cluster may be wrongly split. 

4. Several different clusters may be wrongly combined into one single cluster. 
Algorithm: Offline alert aggregation 

Input: set of alerts $A$, number of components $C$ 
Output: parameters $\mu_c$, $\sigma_c^2$, $\rho_c$; assignment of alerts to components 

1. $\pi_c := 1/C$ 
2. Initialize $\mu_c$, $\sigma_c^2$, $\rho_c$ 
3. While stopping criterion not reached do 
      // E step: assign alerts to components 
4.    For all alerts $a^{(p)} \in A$ do 
5.       $c^* := \arg\max_{c \in \{1, \dots, C\}} H(a^{(p)} \mid \mu_c, \sigma_c^2, \rho_c)$ 
6.       Assign alert $a^{(p)}$ to $c^*$ 
      // M step: update model parameters 
7.    For all components $c \in \{1, \dots, C\}$ do 
8.       $N_c :=$ number of alerts assigned to $c$ 
9.       For all attributes $n \in \{1, \dots, N_m\}$ do 
10.         $\rho_{cn} := \frac{1}{N_c} \sum_p a_n^{(p)}$ 
11.      For all attributes $n \in \{N_m+1, \dots, N\}$ do 
12.         $\mu_{cn} := \frac{1}{N_c} \sum_p a_n^{(p)}$ 
13.         $\sigma_{cn}^2 := \frac{1}{N_c} \sum_p (a_n^{(p)} - \mu_{cn})^2$ 

In steps 10, 12 and 13 the sums run over the alerts $a^{(p)}$ assigned to component $c$. 
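
The listing does not spell out the function $H$ used in the E step. One common choice for alerts with mixed attribute types, assumed here rather than taken from the text, treats the attributes as independent within a component, with a categorical distribution for each categorical attribute and a Gaussian for each continuous one: 

```latex
H(a \mid \mu_c, \sigma_c^2, \rho_c)
  = \prod_{n=1}^{N_m} \rho_{cn}(a_n)
    \prod_{n=N_m+1}^{N} \frac{1}{\sqrt{2\pi\sigma_{cn}^2}}
    \exp\!\left( -\frac{(a_n - \mu_{cn})^2}{2\sigma_{cn}^2} \right)
```

Here $\rho_{cn}(a_n)$ denotes the estimated probability that attribute $n$ takes the value $a_n$ in component $c$. Under this reading, step 10 computes relative frequencies of the categorical values, and steps 12 and 13 compute the usual sample mean and variance. 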

From the above algorithm we can conclude that it performs the following steps: initialization of the 
mixing coefficients and model parameters, assignment of alerts to components, update of the model 
parameters, and a check of the stopping criterion. In this way the alerts are assigned to components iteratively. 
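
The E and M steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used in this paper: the likelihood is a product of categorical frequencies and Gaussians, components are initialized from seed alerts chosen by the caller, and the constants `EPS` and `MIN_VAR` are hypothetical safeguards. Because the mixing coefficients are uniform, they do not affect the argmax and are omitted. 

```python
import math
from collections import Counter

EPS = 1e-6      # probability given to unseen categorical values (assumption)
MIN_VAR = 1e-3  # variance floor to avoid division by zero (assumption)


def fit_component(alerts):
    """M step for one component: frequencies, means and variances."""
    n_c = len(alerts)
    cats, conts = zip(*alerts)
    rho = [{v: k / n_c for v, k in Counter(col).items()} for col in zip(*cats)]
    mu = [sum(col) / n_c for col in zip(*conts)]
    var = [max(sum((x - m) ** 2 for x in col) / n_c, MIN_VAR)
           for col, m in zip(zip(*conts), mu)]
    return rho, mu, var


def log_h(alert, comp):
    """Logarithm of the assumed likelihood of an alert under one component."""
    cat, cont = alert
    rho, mu, var = comp
    ll = sum(math.log(r.get(v, EPS)) for v, r in zip(cat, rho))
    ll += sum(-0.5 * math.log(2 * math.pi * s) - (x - m) ** 2 / (2 * s)
              for x, m, s in zip(cont, mu, var))
    return ll


def offline_aggregate(alerts, seeds, max_iter=20):
    """Hard-assignment EM; `seeds` are indices of alerts that start components."""
    comps = [fit_component([alerts[i]]) for i in seeds]
    assign = None
    for _ in range(max_iter):
        # E step: assign every alert to its most likely component
        new_assign = [max(range(len(comps)), key=lambda c: log_h(a, comps[c]))
                      for a in alerts]
        if new_assign == assign:  # stopping criterion: nothing changed
            break
        assign = new_assign
        # M step: re-estimate the parameters of every non-empty component
        for c in range(len(comps)):
            members = [a for a, k in zip(alerts, assign) if k == c]
            if members:
                comps[c] = fit_component(members)
    return assign, comps


alerts = [
    (("10.0.0.5", "192.168.1.20", "port_scan"), (0.0,)),
    (("10.0.0.5", "192.168.1.20", "port_scan"), (1.0,)),
    (("10.0.0.5", "192.168.1.20", "port_scan"), (2.0,)),
    (("10.0.0.9", "192.168.1.30", "dos"), (100.0,)),
    (("10.0.0.9", "192.168.1.30", "dos"), (101.0,)),
    (("10.0.0.9", "192.168.1.30", "dos"), (102.0,)),
]
assignment, components = offline_aggregate(alerts, seeds=[0, 3])
print(assignment)
```

Each alert is a pair of a categorical tuple and a continuous tuple. In the example the six alerts separate into two components, one per simulated attack instance. 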

3.4 Online Alert Aggregation: 

The above algorithm is extended to perform online alert aggregation. For this, the IDS needs 
component adaption, component creation, and component detection. In component adaption, attack instances 
must be identified and their alerts assigned to the proper cluster. In component creation, a component for a 
new attack is created and its parameters are set. In component detection, attack instances are detected. 
Online Intrusion Alert Aggregation With Generative Data Stream Modeling 
IJMER | ISSN: 2249-6645 | www.ijmer.com | Vol. 4 | Iss. 7 | July 2014 | 901 

Algorithm: Online alert aggregation 

Input: buffer $B$, partition $P$, cluster number $j$ 
Output: parameters $\mu_j$, $\sigma_j^2$, $\rho_j$; assignment of alerts to components 

1. $B := \emptyset$ 
2. While new alert $a$ do 
3.    If $P = \emptyset$ then 
4.       $P_1 := \{a\}$ 
5.       $P := \{P_1\}$ 
6.       Initialize parameters such as $\mu_1$, $\sigma_1^2$ 
7.    Else 
8.       $P' := P$ 
9.       $j^* := \arg\max_j H(a \mid \mu_j, \sigma_j^2, \rho_j)$ 
10.      $P'_{j^*} := P_{j^*} \cup \{a\}$ 
11.      $O_{j^*} := |P'_{j^*}|$ 
12.      For all attributes $n \in \{1, \dots, N_m\}$ do 
13.         $\rho_{j^* n} := \frac{1}{O_{j^*}} \sum_{a \in P'_{j^*}} a_n$ 
14.      For all attributes $n \in \{N_m+1, \dots, N\}$ do 
15.         $\mu_{j^* n} := \frac{1}{O_{j^*}} \sum_{a \in P'_{j^*}} a_n$ 
16.         $\sigma_{j^* n}^2 := \frac{1}{O_{j^*}} \sum_{a \in P'_{j^*}} (a_n - \mu_{j^* n})^2$ 
17.      If $Q(P) <$ [threshold] then 
18.         $P := P'$ 
19.      $B := B \cup \{a\}$ 
20.      If novelty($a$) then 
21.         $P :=$ ALG3($P$, $j^*$, $B$) 
22.         $B := \emptyset$ 
23.   For $j \in \{1, \dots, |P|\}$ do 
24.      If obsoleteness($P_j$) then 
25.         $P := P \setminus P_j$ 
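
The interplay of component adaption, component creation, and removal of obsolete components can be sketched as follows. This is a simplified stand-in, not the algorithm above: the novelty test is replaced by a plain threshold on the log-likelihood instead of the buffer-based call to ALG3, the acceptance test of step 17 is omitted, and the names `Component`, `novelty_threshold` and `max_idle` are hypothetical. Each alert is a pair of a categorical tuple and a timestamp in seconds. 

```python
import math
from collections import Counter


class Component:
    """Model of one attack instance with incrementally updated statistics."""

    def __init__(self, alert):
        self.count = 0
        self.cat_counts = [Counter() for _ in alert[0]]
        self.t_sum = 0.0
        self.t_sq = 0.0
        self.last_seen = alert[1]
        self.add(alert)

    def add(self, alert):
        """Component adaption: fold one alert into the running statistics."""
        cat, t = alert
        self.count += 1
        for counter, value in zip(self.cat_counts, cat):
            counter[value] += 1
        self.t_sum += t
        self.t_sq += t * t
        self.last_seen = t

    def log_h(self, alert, eps=1e-6, min_var=1.0):
        """Log-likelihood: categorical frequencies times a Gaussian over time."""
        cat, t = alert
        ll = sum(math.log(max(c[v] / self.count, eps))
                 for c, v in zip(self.cat_counts, cat))
        mu = self.t_sum / self.count
        var = max(self.t_sq / self.count - mu * mu, min_var)
        return ll - 0.5 * math.log(2 * math.pi * var) - (t - mu) ** 2 / (2 * var)

    def summary(self):
        """Meta alert: most frequent categorical values plus the alert count."""
        return tuple(c.most_common(1)[0][0] for c in self.cat_counts), self.count


def online_aggregate(stream, novelty_threshold=-10.0, max_idle=30.0):
    """Process alerts one by one and return the resulting meta alerts."""
    components, meta_alerts = [], []
    for alert in stream:
        now = alert[1]
        # Obsoleteness: close components that received no alert for max_idle
        for comp in [c for c in components if now - c.last_seen > max_idle]:
            components.remove(comp)
            meta_alerts.append(comp.summary())
        best = max(components, key=lambda c: c.log_h(alert), default=None)
        if best is None or best.log_h(alert) < novelty_threshold:
            components.append(Component(alert))  # component creation
        else:
            best.add(alert)                      # component adaption
    meta_alerts.extend(c.summary() for c in components)  # flush at stream end
    return meta_alerts


stream = [
    (("10.0.0.5", "192.168.1.20", "port_scan"), 0.0),
    (("10.0.0.5", "192.168.1.20", "port_scan"), 1.0),
    (("10.0.0.9", "192.168.1.30", "dos"), 1.5),
    (("10.0.0.5", "192.168.1.20", "port_scan"), 2.0),
    (("10.0.0.9", "192.168.1.30", "dos"), 2.5),
    (("10.0.0.7", "192.168.1.20", "sql_injection"), 60.0),
]
meta_alerts = online_aggregate(stream)
print(meta_alerts)
```

For simplicity the sketch reports a meta alert only when its component is closed. A deployed system could instead emit the meta alert as soon as the component is created and update it afterwards, which keeps the delay after the first alert short. 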



IV. Implementation And Results 

We have implemented a custom simulator in the Java programming language. The system requirements 
for the implementation are JDK 1.6, Eclipse or NetBeans, and JME. The operating system used for the 
implementation is Windows XP. The graphical user interface was developed with the Swing application 
programming interface. The attack types offered by the attack simulation interface are listed below. 

Flooding: Buffer Overflow, Denial of Service (DoS) 
Information Gathering: Sniffing, Port Scanning 
Malware: Viruses, Worms, Trojan Horses 
Authentication Bypass: Password Attacks, Resource Exhaustion 

Figure 2. Different Types of Attacks 



As shown in Figure 2, the attacks that can be simulated are grouped into information gathering, 
authentication bypass, malware, and flooding. The GUI for alert aggregation is shown in Figure 3. 

Figure 3. Simulation of Alerts 

As shown in Figure 3, there is a separate area for the aggregation messages of each layer. 
When an attack is carried out, the appropriate action or message is displayed, as shown in Figure 4. 

Figure 4. Response when attack is done 

Alerts can be sent to the user's registered mobile, as shown in Figure 5. 

Figure 5. GUI of Mobile Alert 



V. Conclusion 

The proposed approach to online alert aggregation has been implemented, and it was found that meta 
alerts can be generated. The false positive rate is reduced because the approach uses a property of data 
stream processing, i.e. each alert is processed only a few times. The experimental results show that the 
approach is effective and helpful when implemented in a real-time application. The accuracy of the IDS also 
increases: more alerts can be detected, and compared to the number of attacks detected only very few false 
positive alerts are introduced. Online intrusion alert aggregation with data stream modeling is therefore an 
efficient way to provide information security in the information technology field. 